
5 research outputs found

Testing the Accuracy of Query Optimizers

Author: Florian M Waas
Mohamed A Soliman
Zhongxian Gu
Publication date: 01/01/2012

The accuracy of a query optimizer is intricately connected with a database system's performance and its operational cost: the more accurate the optimizer's cost model, the better the resulting execution plans. Database application programmers and other practitioners have long provided anecdotal evidence that database systems differ widely with respect to the quality of their optimizers, yet to date no formal method is available to database users to assess or refute such claims. In this paper, we develop a framework to quantify an optimizer's accuracy for a given workload. We make use of the fact that optimizers expose switches or hints that let users influence the plan choice and generate plans other than the default plan. Using these implements, we force the generation of multiple alternative plans for each test case, time the execution of all alternatives, and rank the plans by their effective costs. We compare this ranking with the ranking of the estimated costs and compute a score for the accuracy of the optimizer. We present initial results of an anonymized comparison of several major commercial database systems, demonstrating that there are in fact substantial differences between systems. We also suggest ways to incorporate this knowledge into the commercial development process.
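The ranking comparison described in this abstract can be illustrated with a small sketch. The abstract does not say which score the authors compute, so the example below assumes a plain Kendall rank correlation (tau-a) as a stand-in; the function name `rank_agreement` and all cost and timing figures are hypothetical.

```python
from itertools import combinations


def rank_agreement(estimated, actual):
    """Kendall tau-a between estimated plan costs and measured run times.

    Assumption: this is a generic rank-correlation measure, not
    necessarily the score defined in the paper. Returns 1.0 when both
    rankings agree on every pair of plans and -1.0 when they disagree
    on every pair.
    """
    if len(estimated) != len(actual):
        raise ValueError("need one measured time per estimated cost")
    n = len(estimated)
    if n < 2:
        raise ValueError("need at least two alternative plans")

    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        # A positive product means both orderings rank the pair the same way.
        product = (estimated[i] - estimated[j]) * (actual[i] - actual[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # Ties (product == 0) count as neither.

    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical data: four alternative plans for one test query.
estimated_cost = [100.0, 250.0, 400.0, 900.0]   # optimizer estimates
measured_seconds = [1.2, 0.9, 3.1, 7.5]         # timed executions

score = rank_agreement(estimated_cost, measured_seconds)
print(f"rank agreement: {score:.3f}")
```

In this made-up example the optimizer misorders only the two cheapest plans, so five of the six plan pairs are ranked consistently and one is not. A score near 1 would suggest that the cost model orders plans much as their actual run times do.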

CiteSeerX

Memory Aware Query Scheduling in a Database Cluster

Author: Florian Waas
Martin L. Kersten

Query throughput is one of the primary optimization goals in interactive web-based information systems in order to achieve the performance necessary to serve large user communities. Queries in this application domain differ significantly from those in traditional database applications: they are of lower complexity and almost exclusively read-only. The architecture we propose here is specifically tailored to take advantage of the query characteristics. It is based on a large parallel shared-nothing database cluster where each node runs a separate server with a fully replicated copy of the database. A query is assigned and entirely executed on one single node avoiding network contention or synchronization effects. However, the actual key to enhanced throughput is a resource efficient scheduling of the arriving queries. We develop a simple and robust scheduling scheme that takes the currently memory resident data at each server into account and trades off memory re-use and execution time, ...

CiteSeerX

On Optimal Pipeline Processing in Parallel Query Execution

Author: Florian Waas
Martin L. Kersten
Stefan Manegold

and their applications. SMC is sponsored by the Netherlands Organization for Scientific Research (NWO). CWI is a member o

CiteSeerX